7 research outputs found

    Análise de dados científicos baseada em algoritmos de indexação bitmap

    Large-scale computer simulations often consume and produce large volumes of raw data files, which may come in different formats. Users usually need to analyze domain-specific data based on data elements that are related across multiple files generated during the simulation run. Existing solutions, such as FastBit and NoDB, support this analysis by indexing raw data to allow direct access to specific elements in regions of interest within raw data files. However, those solutions can analyze only a single raw data file at a time, and they are used only after the simulation has finished. The ARMFUL architecture proposes a solution capable of guaranteeing dataflow management, recording related raw data elements in a provenance database, and combining raw data file analysis techniques at runtime. Through a data model that integrates computer simulation execution data with domain data, the architecture allows queries on data elements related across multiple files. This dissertation proposes the implementation of instances of the raw data indexing and query processing components of the ARMFUL architecture, aiming to reduce the elapsed time of data ingestion into the provenance database and to support exploratory analysis of raw data.

    Brazilian Flora 2020: Leveraging the power of a collaborative scientific network

    No full text
    The shortage of reliable primary taxonomic data limits the description of biological taxa and the understanding of biodiversity patterns and processes, complicating biogeographical, ecological, and evolutionary studies. This deficit creates a significant taxonomic impediment to biodiversity research and conservation planning. The taxonomic impediment and the biodiversity crisis are widely recognized, highlighting the urgent need for reliable taxonomic data. Over the past decade, numerous countries worldwide have devoted considerable effort to Target 1 of the Global Strategy for Plant Conservation (GSPC), which called for the preparation of a working list of all known plant species by 2010 and an online world Flora by 2020. Brazil is a megadiverse country, home to more of the world's known plant species than any other country. Despite that, Flora Brasiliensis, concluded in 1906, was the last comprehensive treatment of the Brazilian flora. The lack of accurate estimates of the number of species of algae, fungi, and plants occurring in Brazil contributes to the prevailing taxonomic impediment and delays progress towards the GSPC targets. Over the past 12 years, a legion of taxonomists, motivated to meet Target 1 of the GSPC, worked together to gather and integrate knowledge on the algal, plant, and fungal diversity of Brazil. Overall, a team of about 980 taxonomists joined efforts in a highly collaborative project that used cybertaxonomy to prepare an updated Flora of Brazil, showing the power of scientific collaboration to reach ambitious goals. This paper presents an overview of the Brazilian Flora 2020 and provides taxonomic and spatial updates on the algae, fungi, and plants found in one of the world's most biodiverse countries. We further identify collection gaps and summarize future goals that extend beyond 2020. Our results show that Brazil is home to 46,975 native species of algae, fungi, and plants, of which 19,669 are endemic to the country. The data compiled to date suggest that the Atlantic Rainforest might be the most diverse Brazilian domain for all plant groups except gymnosperms, which are most diverse in the Amazon. However, scientific knowledge of Brazilian diversity is still unequally distributed, with the Atlantic Rainforest and the Cerrado being the most intensively sampled and studied biomes in the country. In times of “scientific reductionism”, with botanical and mycological sciences suffering pervasive depreciation in recent decades, the first online Flora of Brazil 2020 significantly enhanced the quality and quantity of taxonomic data available for algae, fungi, and plants from Brazil. This project also made all the information freely available online, providing a firm foundation for future research and for the management, conservation, and sustainable use of the Brazilian funga and flora.

    Reconstruction of interactions in the ProtoDUNE-SP detector with Pandora

    No full text
    The Pandora Software Development Kit and algorithm libraries provide pattern-recognition logic essential to the reconstruction of particle interactions in liquid argon time projection chamber detectors. Pandora is the primary event reconstruction software used at ProtoDUNE-SP, a prototype for the Deep Underground Neutrino Experiment far detector. ProtoDUNE-SP, located at CERN, is exposed to a charged-particle test beam. This paper gives an overview of the Pandora reconstruction algorithms and how they have been tailored for use at ProtoDUNE-SP. In complex events with numerous cosmic-ray and beam background particles, the simulated reconstruction and identification efficiency for triggered test-beam particles is above 80% for the majority of particle type and beam momentum combinations. Specifically, simulated 1 GeV/c charged pions and protons are correctly reconstructed and identified with efficiencies of 86.1 ± 0.6% and 84.1 ± 0.6%, respectively. The efficiencies measured for test-beam data are shown to be within 5% of those predicted by the simulation.
